Fast Lempel-Ziv Decompression in Linear Space
نویسندگان
چکیده
We consider the problem of decompressing the Lempel-Ziv 77 representation of a string S ∈ [σ] using a working space as close as possible to the size z of the input. The folklore solution for the problem runs in optimal O(n) time but requires random access to the whole decompressed text. A better solution is to convert LZ77 into a grammar of size O(z log(n/z)) and then stream S in optimal linear time. In this paper, we show that O(n) time and O(z) working space can be achieved for constant-size alphabets. On larger alphabets, we describe (i) a trade-off achieving O(n log σ) time and O(z log1−δ σ) space for any 0 ≤ δ ≤ 1, and (ii) a solution achieving optimal O(n) time and O(z log logn) space. Our solutions can, more generally, extract any specified subsequence of S with little overheads on top of the optimal running time and working space. As an immediate corollary, we show that our techniques yield improved results for pattern matching problems on LZ77-compressed text.
منابع مشابه
Byte pair encoding : a text compression scheme that accelerates pattern matching
Byte pair encoding (BPE) is a simple universal text compression scheme. Decompression is very fast and requires small work space. Moreover, it is easy to decompress an arbitrary part of the original text. However, it has not been so popular since the compression is rather slow and the compression ratio is not as good as other methods such as Lempel-Ziv type compression. In this paper, we bring ...
متن کاملLinear Time Lempel-Ziv Factorization: Simple, Fast, Small
Computing the LZ factorization (or LZ77 parsing) of a string is a computational bottleneck in many diverse applications, including data compression, text indexing, and pattern discovery. We describe new linear time LZ factorization algorithms, some of which require only 2n log n + O(log n) bits of working space to factorize a string of length n. These are the most space efficient linear time al...
متن کاملA decompression pipeline for accelerating out-of-core volume rendering of time-varying data
This paper presents a decompression pipeline capable of accelerating out-of-core volume rendering of time-varying scalar data. Our pipeline is based on a twostage compression method that cooperatively uses the CPU and GPU (graphics processing unit) to transfer compressed data entirely from the storage device to the video memory. This method combines two different compression algorithms, namely ...
متن کاملA Lempel-Ziv Compressed Structure for Document Listing
Document listing is the problem of preprocessing a set of sequences, called documents, so that later, given a short string called the pattern, we retrieve the documents where the pattern appears. While optimal-time and linear-space solutions exist, the current emphasis is in reducing the space requirements. Current document listing solutions build on compressed suffix arrays. This paper is the ...
متن کاملEfficient Text Compression Using Special Character Replacement and Space Removal
In this paper, we have proposed a new concept of text compression/decompression algorithm using special character replacement technique. Moreover after the initial compression after replacement of special characters, we remove the spaces between the words in the intermediary compressed file in specific situations to get the final compressed text file. Experimental results show that the proposed...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- CoRR
دوره abs/1802.10347 شماره
صفحات -
تاریخ انتشار 2018